Automatic Extraction of Turkish Hypernym-Hyponym Pairs From Large Corpus

Authors

  • Savas Yildirim
  • Tugba Yildiz
Abstract

In this paper, we propose a fully automatic system for acquiring hypernym/hyponym relations from a large Turkish corpus. The method relies on both lexico-syntactic patterns and semantic similarity. Once the model has extracted seed pairs using the patterns, it applies similarity-based expansion to increase recall. For the expansion, several scoring functions are applied within a bootstrapping algorithm and compared. We show that a model based on a particular lexico-syntactic pattern for Turkish can retrieve many hypernym/hyponym relations with high precision. We further demonstrate that the model can statistically expand the hyponym list beyond the limits of lexico-syntactic patterns and so achieve better recall. During the expansion phase, hypernym/hyponym pairs are extracted automatically and incrementally according to their corpus statistics, using various association measures and graph-based scoring. In brief, the fully automatic model mines only a large corpus and produces is-a relations with promising precision and recall. To achieve this goal, several methods and approaches were designed, implemented, compared, and evaluated.
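The two stages described in the abstract (pattern-based seed extraction, then similarity-based expansion) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not name its pattern, so the sketch uses a hypothetical "X, Y ve Z gibi H" pattern ("H such as X, Y and Z"), and it substitutes plain cosine similarity over window co-occurrence counts for the paper's association measures and graph-based scoring. The function names `extract_seeds`, `context_vectors`, `cosine`, and `expand` are illustrative, and no stemming or morphological analysis is attempted.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical pattern (an assumption, not taken from the paper):
# "<hyponym>, <hyponym> ve <hyponym> gibi <hypernym>"
PATTERN = re.compile(r"((?:\w+, )*\w+ ve \w+) gibi (\w+)")


def extract_seeds(sentences):
    """Return {hypernym: set of hyponyms} for every pattern match."""
    seeds = defaultdict(set)
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            hyponyms = re.split(r", | ve ", match.group(1))
            seeds[match.group(2)].update(hyponyms)
    return dict(seeds)


def context_vectors(sentences, window=2):
    """Count the words that co-occur within `window` tokens of each word."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = re.findall(r"\w+", sentence)
        for i, word in enumerate(tokens):
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand(hyponyms, vectors, top_k=1):
    """Rank non-seed words by mean cosine similarity to the seed hyponyms."""
    scores = {}
    for candidate, vector in list(vectors.items()):
        if candidate in hyponyms:
            continue
        total = sum(cosine(vector, vectors.get(h, Counter())) for h in hyponyms)
        scores[candidate] = total / len(hyponyms)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy corpus, invented for illustration only.
corpus = [
    "elma, armut ve muz gibi meyveler pazarda satılır",
    "taze elma ye",
    "taze armut ye",
    "taze kiraz ye",
]
seeds = extract_seeds(corpus)
vectors = context_vectors(corpus)
for hypernym, hyponyms in seeds.items():
    print(hypernym, sorted(hyponyms), expand(hyponyms, vectors, top_k=3))
```

In a bootstrapping setting, the top-ranked candidates would be added to the hyponym set and the ranking repeated. The stopping criterion and the scoring function are the design choices the abstract says were compared.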


Similar resources

Automatic Extraction of Dutch Hypernym-Hyponym Pairs

In this study, we apply pattern-based methods to text for extracting lexical data, in particular the hypernymy relation. We automatically derive thousands of interesting lexical patterns like such NP as NP and evaluate the performance of these patterns by comparing the information they extract from a newspaper corpus with the information in the Dutch part of EuroWordNet. Additionally we investi...


To Use a Treebank or Not – Which Is Better for Hypernym Extraction?

We compare two processing methods for a single natural language processing task. One uses a treebank created with a full parser while the other restricts itself to lexical and part-of-speech information. We show that for the task under investigation, automatic extraction of hypernym-hyponym pairs from text, the former does not outperform the latter. We compare the output of the two approaches a...


Japanese Hyponymy Extraction based on a Term Similarity Graph

Semantic relations between words, such as hyponymy, synonymy and meronymy, have various information access applications (e.g. Web search) and the automatic extraction of such relations from corpora is an important research problem in natural language processing. For the Japanese language, there exist several linguistic resources that contain these relations, such as the Japanese Wordnet, Nihong...


CS 224N Class Project Automatic Hypernym Classification

Hypernym classification is the task of deciding whether, given two words, one word “is a kind of” the other. We present a classifier that learns the noun hypernym relation based on automatically-discovered lexico-syntactic patterns between a set of provided hyponym/hypernym noun pairs. This classifier is shown to outperform two previous methods for automatically identifying hypernym pairs (usin...


NTNU: An Unsupervised Knowledge Approach for Taxonomy Extraction

Taxonomy structures are important tools in the science of classification of things or concepts, including the principles that underlie such classification. This paper presents an approach to the problem of taxonomy construction from texts focusing on the hyponym-hypernym relation between two terms. Given a set of terms in a particular domain, the approach in this study uses Wikipedia and WordNe...




Publication date: 2012